Data structures and methods for imaging computer readable media

ABSTRACT

Data structures and methods for imaging computer readable media are provided. A data structure used to recreate computer readable media is provided having a type identifier used to identify a type of image data originally imaged. Further, length data identifies the size of a data stream which is a portion of a file associated with the image data, wherein the data stream, combined with additional data streams, is operable to form a complete file.
Moreover, a method of collecting image data is provided wherein control data is associated with a file residing in the image data; the control data includes a partition type, compression type, file length, and one or more file set types. Furthermore, the control data is organized on a computer readable medium, along with the file.
Also, a method of customizing the creation of an image on a computer readable medium is provided wherein zero or more exclude files in an image source are identified and are not included in a created image target. The created image target includes one or more files which are identified as files to retain from the image source.

FIELD OF THE INVENTION

[0001] The present invention relates to data structures and methods for imaging computer readable media.

BACKGROUND OF THE INVENTION

[0002] Providing backup data to a hard drive of a computing device is of vital importance in today's information age. Individuals have become so dependent upon their computing devices that they become nonfunctional when a device is inoperable. When a computing device starts up, or is booted up, it uses a hard drive memory associated with the computing device to provide all the necessary data files and data application programs to the user of the device. Some of these files may reside locally on the computing device, while others are provided remotely to the computing device if the computing device is networked with other devices providing files and applications.

[0003] A standard process, referred to as imaging a hard drive, takes a snapshot of a hard drive, often referred to as a source image, associated with an originating computing device. This allows the source image data to be stored off onto additional computer readable media, such as and by way of example only, CDs, diskettes, additional hard drives, and other magnetic or optical storage devices.

[0004] During the imaging process an end user, or software application initiating the imaging process, has little to no control over how the imaged data coming from an image source (e.g., hard drive of a computing device which is being backed up) will be structured or organized in the imaged data (e.g., external media housing the image source in a format which may be used to restore the image source as needed to an image target).

[0005] An imaging process typically writes, in a binary format (e.g., image data), the image source onto a computer readable medium. This write process is usually performed serially, since attempting to store the data in random access memory (“RAM”) and then further attempting to provide some intelligent structure to the data is not a practical option given the size of the image source, which could exceed several gigabytes or terabytes of data (especially for network or remote imaging of client computing devices). Even if the imaged data are compressed before being housed in RAM, the size of the compressed data still makes the attempt infeasible. Moreover, trying to write to external media as an intermediate step and then running a process to structure the imaged data creates an impractical and time consuming operation.

[0006] However, some minimal structure is often provided to ensure that partition information, or basic file attribute information (e.g., creation date, modified date, and the like) is captured, to ensure that the image data created from an image source is adequately reproducible onto some future image target.

[0007] Yet, because only minimal structure or control is given to the end user or the application initiating an image process, the ability to control or limit what is imaged is minimal. Moreover, it may be extremely desirable to append files from an image source into existing imaged data, without having to perform an entire image process from the start. Imaging can be a time and process intensive operation, which many users are reluctant to perform, and correspondingly users may rely on automated processes initiated in the early morning hours, when the users are not likely to be using their computing devices which are being imaged.

[0008] Likewise, networked computing devices or servers which are imaged are often imaged in the early morning hours or at scheduled times in which the individual end users are notified of scheduled down times when the imaging processes will take place. Further, some imaging processes may not permit other applications to run while the imaging is taking place, because a file which is being imaged could be modified/altered, which may affect the overall integrity of the imaging operation.

[0009] Accordingly, the imaging process should be controllable and customizable by the end user to alleviate usability/performance concerns as discussed above. To do this, the imaging process needs to be controlled at a data file level, rather than what is presently done in the industry, where control is performed at the level of a partition or of the computer readable medium being imaged.

[0010] Moreover, end users should have the ability to view what is imaged and selectively restore items as needed, or exclude items altogether from an image. This would permit users to more intelligently control the backup and restoration of their data. Further, network administrators could use a single image to restore multiple image targets by selectively restoring, on one or more computing devices, only certain aspects of the image data based on file sets associated with individual computing devices.

[0011] Although some control may be provided in the creation of the imaged data, a better data format of the imaged data is needed to permit reproduction of the imaged data on target computing devices, even if some of these target computing devices utilize a file system different from that of the computing device which was the source of the imaged data.

[0012] Further, a better format of the imaged data would permit more user control, and would even permit modifying the imaged data by excluding or including selected files within the imaged data without compromising the integrity of the original imaged data. Additionally, the imaged data may be more effectively compressed with a better format, which provides greater flexibility and control for future uses which may be desired by an end user.

SUMMARY OF THE INVENTION

[0013] Accordingly, an object of the invention is to provide data structures and methods for imaging computer readable media. During the imaging process of a computer readable medium associated with a computing device, the data being captured from the source computer readable medium are structured such that information regarding one or more files residing on the source computer readable medium is easily customized and restored to a target computer readable medium optimally and as needed. A type identifier identifies the type of data associated with a file in the source computer readable medium and a length data identifies the length of a data stream associated with at least a portion of the file. One or more data streams combine to form a single complete file.

[0014] Further, a file header includes an index to a partition type and includes one or more type identifiers along with one or more set types. A set type associates the file with logical groups, such as and by way of example only, files to include during an image process, files to exclude during an image process, files associated with work groups (e.g., system administrators, secretaries, developers, managers, and the like), files associated with certain data formats, and others.

[0015] Once a source computer readable medium is imaged such that the above attributes are recorded, a single image may provide multiple customized image targets to one or more additional computing devices. Moreover, since the imaging is performed at a file level, rather than the typical partition level, an interface (e.g., browser, customized windowing application, and the like) provides a user the ability to control the imaging process, eliminating unwanted files in the image target. As one skilled in the art will readily appreciate, this provides tremendous benefit to the user or network administrators performing the imaging.

[0016] Additional objectives, advantages and novel features of the invention will be set forth in the description that follows and, in part, will become apparent to those skilled in the art upon examining or practicing the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the appended claims. To achieve the foregoing and other objects and in accordance with the purpose of the present invention, data structures and methods for imaging computer readable media are provided.

[0017] A data structure used to recreate computer readable media is provided having a type identifier used to identify a type of image data originally imaged. Further, length data is used to identify the size of a data stream which is at least a portion of a file located in the imaged data. Zero or more additional data streams are operable to be assembled to form the complete file.

[0018] In another aspect of the present invention, a method of collecting image data having executable instructions is provided wherein control data is received and associated with a file included in image data. The control data further includes a partition type, compression type, file length, and one or more file set types. Moreover, the control data is organized on a computer readable medium together with the file.

[0019] In yet another aspect of the present invention, a method of customizing the creation of an image on a computer readable medium is provided having executable instructions, wherein zero or more files in an image source are identified as exclude files and therefore not included in a created image target. Furthermore, one or more files in an image source are identified as files to include in the created image target.

[0020] Still other aspects of the present invention will become apparent to those skilled in the art from the following description of an exemplary embodiment, which is, by way of illustration, one of the exemplary modes contemplated for carrying out the invention. As will be realized, the invention is capable of other different and obvious aspects, all without departing from the invention. Accordingly, the drawings and descriptions are illustrative in nature and not restrictive.

BRIEF DESCRIPTION OF THE DRAWINGS

[0021] The accompanying drawings, incorporated in and forming part of the specification, illustrate several aspects of the present invention and, together with their descriptions, serve to explain the principles of the invention. In the drawings:

[0022] FIG. 1 depicts various data structures of the present invention;

[0023] FIG. 2 depicts one method of collecting image data; and

[0024] FIG. 3 depicts one method of customizing the creation of image data.

DETAILED DESCRIPTION

[0025] The present invention provides data structures and methods for imaging computer readable media. One embodiment of the present invention is implemented in NOVELL's ZENworks for Desktops product using the C or C++ programming language. Of course other operating systems and programming languages (now known or hereafter developed) may also be readily employed.

[0026] The image of hard drive data may be stored on a network disk drive, or on an external media such as a CD, diskette, tapes, ZIP magnetic drives, and the like. Although as one skilled in the art will appreciate, the image may be stored on any computer readable media such as optical, magnetic, and the like. Moreover, the image need not reside on a single computer readable medium; rather, several computer readable media may include the data having a single image of a hard drive.

[0027] Furthermore, a computing device is any device capable of processing executable instructions, such as and by way of example only, personal computers, super computers, mainframe computers, mid-range workstation computers, GPS systems, car computing devices, hand held devices (e.g., Personal Digital Assistants), computing peripheral devices (e.g., MP3 players, printers, scanners, faxes, copiers, CD-RW, ZIP drives, electronic books, electronic gaming devices, and others), mobile telecommunication devices (e.g., wireless phones), intelligent appliances, intelligent apparel (e.g., watches), and the like.

[0028] A typical imaging process will serially read the source image data, which may include by way of example only a hard drive of a single computing device, a series of hard drives resident on multiple computing devices, external removable computer readable media (e.g., magnetic or optical), remote storage devices, and the like. As these data are read from the source image data, they are written as imaged data, typically in a binary format. The imaged data reside on computer readable media (e.g., magnetic, optical, and others). Again, the imaged data may reside on a mirrored drive associated with the hard drive of a computing device, external/removable computer readable media, remote computer readable media and the like.

[0029] As the imaging process proceeds, information for each partition of the image source data is collected. Partitions are areas on the computer readable media having the data; each partition may include a certain number of files or size of data, and other information which is readily apparent to those skilled in the art. The partition information assists in locating data logically assigned within a certain location on the image source data. Partition information permits file systems such as NetWare, NTFS, EXT2, FAT, BeOs, and others to operate as designed. As one skilled in the art will readily appreciate, the partition information will be structured according to the individual file systems supported or being used by a computing device's computer readable media.

[0030] Imaged data will also include an image header which will identify the number of partitions included within the image source data. An image header combined with partition information (e.g., partition headers) assists an image restore process in creating a replica of the image source data when requested to do so. Yet as presented above, since there is no control in the typical image process beyond the partitioning level, customization of imaged data and the migration of image source data from an original partition structure to a different partition structure once restored are not readily achievable.

[0031] To remedy these and other shortcomings in the industry, the image process is modified such that the imaged data include file headers which are indexed to a particular partition header. In this way, a single file originating from an image source may be appended to already existing imaged data and be associated with the correct partition in the image source. In other words, file data residing in the image source data does not have to be sequentially stored in the imaged data. This provides obvious benefits to a customized imaging process by allowing new files to be readily added to the imaged data with little to no user interruption and without the need to initiate an entire re-image process.

[0032] Moreover, file header information includes a number of additional attributes which are of particular use to an improved imaging process. For example, file headers may include set type information. This set type information may provide information about what workgroup a user belongs to, whether the file has been identified by a user or an automated process as a file to be excluded or included in the imaged data, and others.

[0033] The include and exclude file type information may be based on workgroups, such that a file is included with one workgroup but not included with another workgroup. As one skilled in the art will readily appreciate, this permits a single imaged data source to support the creation of multiple image target sources on multiple computer readable media servicing multiple computing devices, based on file set types. This provides tremendous customization capabilities for network administrators or individual end users by using the imaged source of the present invention during a restore process.

[0034] Moreover, data associated with a file may be further broken into parts following a file header within the imaged data, such that the sum of the individual parts forms a single complete file. The parts may each include a header which identifies its data type; a data type, by way of example only, may include data links, directory names, raw data, compressed data, uncompressed data, and others. Moreover, a flag may be used within the part header to indicate the type of data compression on the data which follows. Although, as one skilled in the art will readily appreciate, compression need not occur at all.

[0035] Further, the part header will indicate the length of the data that follows the part header; the length may include the uncompressed length of the data that follows and/or the length of the compressed data which follows. After the length data, the data will follow. In this way the structure of the imaged source may appear as depicted in FIG. 1.
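By way of illustration only, the part header described above may be sketched as a fixed-size binary record followed by its data. The field widths, byte order, and the names PART_HEADER, TYPE_RAW, COMP_NONE, pack_part, and unpack_part are assumptions made for this sketch and are not taken from the specification.

```python
import struct

# Hypothetical layout (an assumption, not the format of the specification):
# 1-byte data type, 1-byte compression flag, 4-byte uncompressed length,
# 4-byte stored length, all little-endian with no padding.
PART_HEADER = struct.Struct("<BBII")

TYPE_RAW = 0   # hypothetical type code for raw data
COMP_NONE = 0  # hypothetical flag meaning "no compression"


def pack_part(data_type, comp_flag, uncompressed_len, payload):
    """Return a part header immediately followed by the payload it describes."""
    header = PART_HEADER.pack(data_type, comp_flag, uncompressed_len, len(payload))
    return header + payload


def unpack_part(buf, offset=0):
    """Parse one part at offset; return its fields and the offset of the next part."""
    data_type, comp_flag, raw_len, stored_len = PART_HEADER.unpack_from(buf, offset)
    start = offset + PART_HEADER.size
    payload = buf[start:start + stored_len]
    if len(payload) != stored_len:
        raise ValueError("part is truncated")
    return data_type, comp_flag, raw_len, payload, start + stored_len
```

Because each header carries the stored length, a restore process could step from one part to the next without scanning the data itself.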

[0036] FIG. 1 depicts various data structures of the present invention. An entire imaged data 10 includes a beginning image header 20, one or more partition headers 30 and 40, one or more file headers 70 and 140, one or more data type headers 80 and 110, one or more length data 90 and 120, and one or more data parts 100 and 130.

[0037] An image header 20 may include information regarding the number of partitions. Following the image header 20 are one or more partition headers Part₀ 30 and Part_(n-1) 40. Each partition header includes partition information describing the types of partition included in the original image source data. It may also include, by way of example only, the original cluster size, the total number of sectors in the original partition, the number of used sectors in the originating partition, original and minimum size of the original source partition, and the like.

[0038] Optimally, as the imaging process proceeds, partition headers, as identified in the image source data, are indexed and retained in RAM in a data structure controlled by a set of executable instructions, such as by way of example only, an array where the array location is the index value, and each element of the array is a data structure having the partition header information, as described above.
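By way of example only, the indexed array of partition headers may be sketched as follows. The class name PartitionHeader, its field names, and the helper register_partition are assumptions chosen to mirror the attributes listed above; they are not part of the described embodiment.

```python
from dataclasses import dataclass


@dataclass
class PartitionHeader:
    """Hypothetical in-memory form of one partition header."""
    partition_type: str   # e.g. "FAT" or "NTFS"
    cluster_size: int     # original cluster size
    total_sectors: int    # total sectors in the original partition
    used_sectors: int     # used sectors in the original partition


def register_partition(table, header):
    """Append a header to the table; its array position serves as the index value."""
    table.append(header)
    return len(table) - 1
```

A file header then only needs to store the returned index in order to refer back to its partition.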

[0039] Next, following the partition headers are one or more file headers, such as a first encountered file header File₀ 70 continuing until a last file header File_(y-1) 140 is encountered. File headers are associated with data in the image source; different types of data may be defined, such as and by way of example only, data links, compressed data, uncompressed data, directory names, and the like. A file header includes a partition index such that it may be associated with a particular partition header. This permits file headers and file data to be appended to imaged data without having to recreate new imaged data each time a file is added on an image source.
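A minimal sketch of appending a file to existing imaged data is given below, assuming a hypothetical record layout in which each file header stores its partition index and a name length. The names FILE_HEADER, append_file, and read_files are illustrative only, and a single length field stands in for the type and length headers of FIG. 1.

```python
import io
import struct

# Hypothetical file header: 2-byte partition index, 2-byte name length.
FILE_HEADER = struct.Struct("<HH")


def append_file(image, partition_index, name, payload):
    """Write one file record at the end of the image; earlier records are untouched."""
    image.seek(0, io.SEEK_END)
    encoded = name.encode("utf-8")
    image.write(FILE_HEADER.pack(partition_index, len(encoded)))
    image.write(encoded)
    image.write(struct.pack("<I", len(payload)))  # length data
    image.write(payload)                          # file data


def read_files(image, start=0):
    """Read every file record from start to the end of the image."""
    image.seek(start)
    records = []
    while True:
        head = image.read(FILE_HEADER.size)
        if len(head) < FILE_HEADER.size:
            break  # end of imaged data
        partition_index, name_len = FILE_HEADER.unpack(head)
        name = image.read(name_len).decode("utf-8")
        (length,) = struct.unpack("<I", image.read(4))
        records.append((partition_index, name, image.read(length)))
    return records
```

Since every record names its own partition, the order of records in the image need not follow the order of files on the image source.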

[0040] As discussed above, file headers also include set type information. This permits files to have attribute information beyond what is normally retained within imaged data. A single file may include one to many set types, such that the file may be easily identified as a file to include in an image target for a specified workgroup (e.g., system administrator, manager, developer, secretary, and the like), or alternatively the same file may have set types which identify it as a file not to include in an image target for a different workgroup. As one skilled in the art will appreciate, this permits a single image data source to be used to restore multiple variant versions of image target sources on demand.
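The use of set types to select files for a given workgroup may be sketched as follows. Representing set types as two Python sets named include_sets and exclude_sets is an assumption of this sketch, as are the names FileHeader and files_for_workgroup.

```python
from dataclasses import dataclass, field


@dataclass
class FileHeader:
    """Hypothetical file header carrying a partition index and set types."""
    name: str
    partition_index: int
    include_sets: set = field(default_factory=set)  # workgroups that receive the file
    exclude_sets: set = field(default_factory=set)  # workgroups that must not receive it


def files_for_workgroup(headers, workgroup):
    """Return names of files to restore to an image target for one workgroup."""
    return [
        h.name
        for h in headers
        if workgroup in h.include_sets and workgroup not in h.exclude_sets
    ]
```

With such a representation, the same list of headers could yield a different image target for each workgroup.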

[0041] Following the file headers, a first type header Type₀₁ 80 identifies the type of data associated with the file being imaged from the image source. Data types may be any of those enumerated above, as well as others. A file header may be followed by a single type header or multiple type headers, until an ending type header Type_(0x) 110 is encountered. Next, length₀₁ data 90 follows Type₀₁ 80 and ending length_(0x) data 120 follows ending Type_(0x) 110. As discussed above, the length data may include the length of the compressed data which follows it and/or the length of the data that follows it in an uncompressed form.

[0042] Optionally, a compression indicator may be used to identify the type of compression applied to the data associated with a file; in this way an appropriate decompression algorithm may be used to properly restore the imaged data when needed. As one skilled in the art will appreciate, length data are useful in validating image target data and in parsing the imaged data during restore operations to an image target.

[0043] Following the length data are one or more file data; these units of data, when assembled, form a complete file data associated with a file header. For example, data₀₁ 100 follows length₀₁ 90, and an ending data unit data_(0x) 130 follows an ending length data length_(0x) 120. Breaking the file data up into units, as depicted with 60, permits more granular control over the file and its storage within the imaged source. For example, the imaged source may be optimally compressed in a variety of compressed forms, such as zip, or any ad hoc developed compression technique.
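By way of illustration, assembling a complete file from its data units, with the compression indicator selecting the decompression routine and the length data used for validation, may be sketched as follows. The flag values COMP_NONE and COMP_ZLIB and the choice of zlib are assumptions of this sketch; the specification does not mandate any particular compression technique.

```python
import zlib

# Hypothetical compression indicator values.
COMP_NONE = 0
COMP_ZLIB = 1

# Map each indicator to the routine that restores the original bytes.
DECOMPRESSORS = {
    COMP_NONE: lambda payload: payload,
    COMP_ZLIB: zlib.decompress,
}


def assemble_file(parts):
    """Join (compression flag, uncompressed length, payload) units into one file."""
    chunks = []
    for comp_flag, uncompressed_len, payload in parts:
        data = DECOMPRESSORS[comp_flag](payload)
        if len(data) != uncompressed_len:
            # The length data doubles as a validation check during restore.
            raise ValueError("restored length does not match length data")
        chunks.append(data)
    return b"".join(chunks)
```

Each unit is handled independently, so units of one file could use different compression types under this sketch.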

[0044] Moreover, information regarding a single file is depicted as unit 50 in FIG. 1, and this permits granular control at the file level within the imaged data. As previously presented, this allows single files to be appended onto the end of the imaged data without the need to re-image the entire image source data, since the file may be readily associated with a particular partition from the image source data.

[0045] Further, as will be apparent to one skilled in the art by 150 of FIG. 1, file units 50 may be repeated as many times as needed, until an end of file is encountered within the image source data. Additionally, with the partition information retained, conversion or translation utilities may be deployed wherein a single image source stored as imaged data 10 may be populated on a restore operation to one or more image targets, each target having a different partition.

[0046] For example, an image source associated with a NetWare file system and corresponding partition information may be stored utilizing the data structures depicted in FIG. 1 to form imaged data 10. The imaged data 10 may then be used to restore on a computer readable medium using the NTFS file system and having different file partition information by using a translation algorithm. As one skilled in the art will readily appreciate, this provides tremendous benefits by allowing imaged data to be partitioned on multiple file systems.

[0047] Moreover, the entire imaged data 10 may be stored in traditional binary format or in extensible markup language (XML), unicode international format, universal disk format (UDF), and others. By storing image data in a universal format it may be readily interfaced with standard browsers or other windowing interfaces permitting network administrators or users to modify file types, view the image source and further control and customize, browse, view, and store the image data. This provides tremendous control for the end user and tremendous flexibility in uniquely restoring the imaged data to one or more image targets.
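As one possible illustration of storing the headers in XML, the following sketch serializes partition and file header information with a standard XML library. The element and attribute names (image, partition, file, set) and the function image_to_xml are hypothetical; the specification does not define an XML schema.

```python
import xml.etree.ElementTree as ET


def image_to_xml(partitions, files):
    """Serialize header information to an XML string.

    partitions: list of partition type names, indexed by position.
    files: list of (name, partition index, list of set types) tuples.
    """
    root = ET.Element("image", partitions=str(len(partitions)))
    for index, partition_type in enumerate(partitions):
        ET.SubElement(root, "partition", index=str(index), type=partition_type)
    for name, partition_index, set_types in files:
        node = ET.SubElement(root, "file", name=name, partition=str(partition_index))
        for set_type in set_types:
            ET.SubElement(node, "set").text = set_type
    return ET.tostring(root, encoding="unicode")
```

A document of this kind could be opened in a browser or parsed by an administrative tool to view or edit set types before a restore operation.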

[0048] FIG. 2 depicts one method of collecting image data. Initially in step 170, the image source data is parsed with an image header created in step 160 and one or more partition headers created in step 180. Next, as files are encountered in the image source, file headers are created in step 200. When a file header is created it is associated with a partition header in step 190. As previously presented, partition headers are uniquely indexed and this information is readily available to associate and store within the created file headers.

[0049] Creating file headers continues until an end of file (“EOF”) is detected in the image source, although in the present invention files may be individually appended onto the created image data since file headers may be uniquely associated with partition headers using the partition index value. For each encountered file, the file data type is identified in step 220 and, if present, a compression type is identified in step 240.

[0050] Also, as previously discussed, file headers are associated with one or more file sets, which permit files to be included, excluded, and belong to user defined workgroups. The set information may also permit permissions to control who may or may not perform a restore operation on the file, as well as other restrictions or customizations.

[0051] Next, a length data is created in step 250 and it is derived from the raw file data chunk in step 260. As previously presented, the file data may include one or more data chunks, and the process of steps 220 through 260 continues to iterate until a new file is detected or an EOF is detected. Once a file header and the series of type, length, and data chunk information is collected for the entire image source, a data stream is created in step 230.

[0052] The image header created in step 160 and the partition headers created in step 180 combine with the data stream created in step 230 to form an imaged data which may then be used by a set of executable instructions to perform a restore operation in step 280, wherein the restore operation may populate one or more image target sources. The image target sources may have different partition information (e.g., file systems, and the like) from the image source. Also, the image target sources may exist on different media from the original image source (e.g., hard drives, magnetic computer readable media, optical computer readable media, one or more logically associated computer readable media, remote computer readable media, and the like).
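The collection method of FIG. 2 may be sketched, under simplifying assumptions, as a loop that builds an image header, partition headers, and then a data stream of file headers each followed by type, length, and data chunks. The dictionary keys, the fixed chunk size, and the function names chunked and collect_image are assumptions of this sketch; an in-memory dictionary stands in for the serially written imaged data.

```python
def chunked(data, size):
    """Yield successive chunks of at most size bytes."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def collect_image(partitions, source_files, chunk_size=4):
    """Build imaged data from partition type names and a mapping of files.

    source_files maps a file name to (partition index, file bytes).
    """
    image = {
        "image_header": {"partition_count": len(partitions)},
        "partition_headers": [
            {"index": i, "type": t} for i, t in enumerate(partitions)
        ],
        "data_stream": [],
    }
    for name, (partition_index, data) in source_files.items():
        # Each chunk gets its own type and length ahead of the data.
        parts = [
            {"type": "raw", "compression": None, "length": len(c), "data": c}
            for c in chunked(data, chunk_size)
        ]
        image["data_stream"].append(
            {
                "file_header": {"name": name, "partition_index": partition_index},
                "parts": parts,
            }
        )
    return image
```

A restore operation could then walk the data stream, look up each file's partition header by index, and concatenate the parts to recover the file.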

[0053] FIG. 3 depicts one method of customizing the creation of image data. Initially, an image source is iterated in step 290, and as it is iterated files are identified in step 300 as exclude files while in step 320 files may be identified as include files.

[0054] Identification may be automatic or by manual selection by a user, utilizing a browser or any other interface device to identify during the image process that files belong to different types of sets. Moreover, as presented above, a single file may have multiple items of set information, such that a single file may be excluded for some users upon a restore operation of the imaged data to an image target but included for other users.

[0055] Moreover, the type of partition, such as a Partition A, is identified and associated with the image source in step 310. This information will be retained and permits translation/conversion utilities to be developed where the imaged data created from an image source may be translated to a different partition type.

[0056] Set type information is collected in step 330, and a single image target or multiple image targets may be created in step 350 using a restore operation on the imaged data collected from iterating the image source in step 290. Further, an image target may have the same partition as the original image source as depicted in step 340, or alternatively an image target may have a partition different from the original image source as depicted in step 360.

[0057] Furthermore, customized image target sources may be derived from the imaged data in step 370, and stored on different computer readable media as depicted in steps 380 and 390.
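The customization method of FIG. 3 may be sketched as two steps: identifying exclude and include files while iterating the image source, and then deriving one customized image target per set type. Automatic identification by file name pattern is only one possible approach and is an assumption of this sketch, as are the function names tag_files and build_targets.

```python
from fnmatch import fnmatchcase


def tag_files(names, exclude_patterns):
    """Split file names into include files and exclude files by pattern."""
    include, exclude = [], []
    for name in names:
        if any(fnmatchcase(name, pattern) for pattern in exclude_patterns):
            exclude.append(name)
        else:
            include.append(name)
    return include, exclude


def build_targets(include_files, set_types):
    """Group include files into one customized image target per set type.

    set_types maps a file name to the workgroups that should receive it.
    """
    targets = {}
    for name in include_files:
        for workgroup in set_types.get(name, ()):
            targets.setdefault(workgroup, []).append(name)
    return targets
```

Each resulting list could then be restored to a separate computer readable medium, with or without a change of partition type.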

[0058] As one skilled in the art will readily appreciate, the present invention permits tremendous benefits over existing imaging processes and restoration processes associated with backing up data stored on computer readable media. It also permits the user to have greater control and flexibility over the process and the eventual restoration. Other benefits will also be readily apparent to those skilled in the art.

[0059] The foregoing description of an exemplary embodiment of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive nor to limit the invention to the precise form disclosed. Many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the above teaching. Accordingly, this invention is intended to embrace all alternatives, modifications, and variations that fall within the spirit and broad scope of the attached claims.

What is claimed:
1. A data structure used to recreate computer readable media, comprising: a type identifier used to identify a type of image data originally imaged; and a length data used to identify the size of a data stream which is at least a portion of a file which is associated with the image data, wherein zero or more additional data streams are operable to be assembled to form the complete file.
2. The data structure of claim 1, further comprising: an image header identifying a number of partitions on the image data.
3. The data structure of claim 2, further comprising: one or more partition headers each identified by a unique index number.
4. The data structure of claim 3, further comprising: a file header having a specific unique index number identifying a specific partition header; and one or more file set identifiers identifying one or more file associations.
5. The data structure of claim 1, wherein the data structure is stored in a binary format.
6. The data structure of claim 5, wherein the data structure is stored as extensible markup language, unicode international format, or universal disk format.
7. The data structure of claim 1, wherein the type identifier is associated with a data link, compressed data, or uncompressed data.
8. The data structure of claim 1, wherein the data stream originated from at least one of the following environments: NetWare, NTFS, EXT2, FAT, and BeOs.
9. A method of collecting image data, comprising the executable instructions of: receiving control data associated with a file wherein the file is included in image data, and the control data includes a partition type, compression type, file length, and one or more file set types; and organizing the control data on a computer readable medium, along with the file.
10. The method of claim 9, wherein the control data and the file are stored in a universal format on the computer readable medium.
11. The method of claim 9, wherein the control data and the file are received from a hard drive of a computing device.
12. The method of claim 9, further comprising: restoring the file to a second computer readable medium using the control data.
13. The method of claim 12, wherein the second computer readable medium has a second partition type different from the partition type.
14. The method of claim 9, wherein the file set types include at least one of include sets, delete sets, and work group sets.
15. The method of claim 9, wherein control data and the file may be disregarded by a user during a restore operation if the file is identified by the user as being unnecessary.
16. A method of customizing the creation of an image on computer readable medium, comprising the executable instructions of: identifying zero or more exclude files in an image source as files not to include in a created image target; and identifying one or more include files in an image source which are included in the image target.
17. The method of claim 16, further comprising: identifying a first partition type of the image source and permitting a second partition type different from the first partition type to exist on the image target.
18. The method of claim 16, further comprising: receiving one or more set types associated with one or more of the include files.
19. The method of claim 18, further comprising: creating one or more customized image targets based on one or more of the set types from the image target.
20. The method of claim 19, wherein one or more of the customized targets are created on one or more computer readable media used by one or more computing devices.